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1  Introduction 


A  continuing  element  in  divide- and- conquer  algorithms  descended  from  Cup- 
pen’s  [4,  5]  is  the  solution  of  an  eigensystem  that  differs  from  diagonal  by 
a  rank-1  perturbation.  Let  D  =  diag  (^i,  •  •  • ,  Sn)  be  a  real  diagonal  matrix, 
p  a  nonzero  real  number,  and  2;  =  (('i,  •  •  • ,  a  real  vector;  the  perturbed 
system’s  matrix  is 

D+yz^.  (1) 

Its  eigenvalues  are  the  zeros  x  =  Ai,  •  •  • ,  A„  of  the  secular  function  (see  [7]) 

n  a2 

M  =  (2) 

j=i  j 

and  the  eigenvector  corresponding  to  each  A^  is  parallel  to 

{D  -  Xkiy^z.  (3) 

(Here  /  is  the  n  X  n  identity  matrix.)  Numerical  difficulties  arose  when 
these  were  not  computed  accurately  enough  to  be  orthogonal,  but  recent 
work  [11,  9]  has  overcome  these  difficulties.  Ours  is  a  small  contribution  to 
solving  the  secular  equation  f(x)  =  0  as  accurately  as  necessary  but  faster 
than  before.  Certain  details  necessary  for  robustness  in  the  code  will  be 
discussed  too.  Inattention  to  such  details  can  cause  accidents  like  Division 
by  Zero,  which  the  author  has  encountered  while  testing  others’  programs. 

The  paper  is  organized  as  follows:  In  Section  2,  we  study  various  his¬ 
toric  ways  to  rationally  interpolate  the  secular  function  (2)  and  then  develop 
three  different  schemes  for  solving  the  equation  f(x)  =  0,  two  of  which  are 
essentially  due  to  [3]  and  the  other  one  is  new  and  fastest  among  the  three. 
The  way  to  interpolate  f(x)  in  Section  2  is  not  always  efficient  because,  in 
which,  attention  is  largely  paid  to  the  positions  of  two  nearby  poles.  Sec¬ 
tion  3  gives  a  closer  look  into  cases  when  attention  has  to  be  paid  not  only 
to  the  positions  of  most  relevant  poles  but  also  to  weights  over  particular 
poles.  Important  implementation  issues  like  securing  initial  guesses  and  se¬ 
lecting  the  best  scheme  are  discussed  in  detail  in  Sections  4  and  6.  Numerical 
examples  with  detailed  explanation  are  given  in  Section  7.  Discussions  of 
various  stopping  criteria  and  justihcations  of  our  proposed  stopping  criteria, 
are  presented  in  Section  5.  Numerical  examples  with  detailed  explanation 
are  given  in  Section  7.  Section  8  presents  our  conclusions. 
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Before  we  proceed,  let’s  make  some  convenient  assumptions.  Without 
loss  of  generality,  we  assume  that 

<  ^2  <  •  •  •  <  ^717  and  every  (j  ^  0. 

Otherwise,  deflation  would  give  a  new  matrix  of  form  (1)  with  a  lower  di¬ 
mension  [4,  5,  10].  Then  it  is  easy  to  see  that  j{x)  has  precisely  n  zeros  Aj-, 
one  in  each  open  interval  and  one  to  the  right  of  if  p  >  0  or 

to  the  left  of  otherwise.  To  simplify  our  presentation,  in  the  following  we 

T 

assume  p  >  0,  and  set  Then  the  n  eigenvalues  Ai,  •  •  •,  A„ 

of  the  matrix  (1)  satisfy 

8i  A^'  ^  7  ^  —  17  *  *  *  7 

2  Secular  Equation  Solvers  (I) 

One  obvious  method  to  solve  the  secular  equation  f{x)  =  0  is  Newton’s 
method.  However,  as  argued  in  [3],  since  f{x)  is  a  rational  function  having 
poles  at  ^1,  •  •  • ,  Newton’s  method,  based  upon  a  local  linear  interpolation, 
may  not  be  a  good  method.  Bunch,  Nielsen  and  Sorensen  [3]  proposed  a 
method  based  upon  rational  osculatory  interpolation.  A  major  drawback  of 
their  method  is  that  it  takes  too  many  iterations  in  certain  circumstances, 
as  we  shall  see. 

The  following  subsection  explores  three  rational  approximations  to  f(x), 
from  which  we  will  develop  three  iteration  schemes  to  solve  the  equation 
f(x)  =  0.  One  of  them  we  call  the  Middle  Way  is  new  and  fastest.  Most  of 
material  in  this  section  is  presented  for  historic  reasons  and  for  comparisons. 

2.1  Rational  Osculatory  Interpolations 

In  this  section,  we  explore  the  osculatory  interpolation  of  the  secular  function 
fix)  in  (2)  by  a  combination  of  the  following  two  kinds  of  simple  rational 
functions: 

F{x]p^q)^=  — - —  and  Gia;;  r,  s)  r ^ — .  (4) 

p  —  X  0  —  X 

They  will  be  used  in  conjunction  with  a  partition  of  the  secular  function 
f{x)  =  p  +  tpkix)  +  i’kix)  where,  for  A;  =  1,  2,  •  •  • ,  or  ra, 

fc  p2  ri  p2 

=  Y.  (5) 

2=1  ^  3=k+l  ^ 
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(Convention  sets  (jjnix)  =  0.)  The  choice  of  k  will  depend  upon  which 
eigenvalue  is  being  computed.  It  is  easy  to  see  that  for  Sk  <  x  < 

-00  <  'lpk(x)  <  0  <  (f)k(x)  <  +00. 

For  any  approximation  y  to  A^,  we  shall  approximate  each  of  'ipkix)  and 
4>k{x)  by  a  simpler  form  F  oi  G  chosen  to  match  tpk  and  in  value  and 
derivative  at  x  =  y,  in  other  words,  we  perform  osculatory  interpolation. 

So  far,  nothing  we  have  said  differs  from  past  practice  [3,  5,  9].  What 
will  be  novel  will  be  the  way  we  choose  which  of  F  and  G  should  be  used 
with  each  of  tpk  and  (pk]  in  fact,  we  shall  see  that  F  is  best  not  used  at  all. 


2.1.1  Two  Ways  to  Rationally  Interpolate  'ipkix) 

Let  y  be  a  hxed  approximation  to  A^  somewhere  between  6k  and  6k-\-i.  Now 
we  have  a  choice.  We  can  choose  parameters  p  and  q  in  F(x-,p,  q)  so  that 

F{y;  p,  q)  =  ^k{y),  F\y;  p,  q)  =  'ip'kiy). 

Or  we  can  choose  ^  to  match  the  pole  of  'ipkiF)  next  to  y  (and  A^)  and 
then  choose  parameters  r  and  s  in  G{y]  6k,  r,  s)  so  that 

G{y,  6k,  r,  s)  =  iJk{y),  G'(y,  6k,  r,  s)  =  i^'kiy). 


Here  are  the  formulas  for  the  parameters: 
•  In  F(x-,p,q)  =  q/(p  —  x). 


In  G(x-,  6k,  r,s)  =  r  +  s/(6k  -  x). 


r  = 


5  = 


y  +  My)/i^kiy) 

(6) 

.  k  y2 

^  V  6- 

i’'kiy)U^^^-y^" 

i’kiyf ! 'i’kiy)  >  o. 

(7) 

-  x). 

i^kiy)  -  {h  -  yWkiy) 

(8) 

{h  -  yfyy)  >  0. 

(9) 
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In  fact,  Cl  <  s  <  C|;  if  A;  >  1,  E  -  v)  =  C’k-iiy)  <  r  <  0;  and  if 

j=i  j=i 

A;  =  1,  r  =  0. 

Let  these  assignments  to  p,  g,  r,  s  hold  throughout  the  rest  of  this  §  2.1.1. 
Here  are  some  of  the  properties  of  the  two  osculatory  interpolating  functions 
that  our  subsequent  analysis  will  exploit. 

Proposition  1  8i<p<8k<y. 

This  proposition  indicates  that  the  pole  p  of  F(x-,p,q)  lies  farther  from 
y  than  6k  does.  Therefore  F(x-,p,  q)  may  approximate  'C’kix)  poorly  between 
its  pole  6k  and  y.  This  happens  when  Ck  is  relatively  small,  in  which  case 
\k  lies  very  close  to  6k,  and  therefore  y  will  do  likewise. 

Since  the  interpolating  rationals  are  unique,  as  we  can  see  from  the  above 
equations,  they  are  identically  equal  to  'C’kix)  \i  k  =  1. 

Theorem  1  If  k  >  2,  then  for  6k  <  x  <  +oo  and  x  fz  y 

G{x;6k,r,s)  <  ■fkix)  <  F{x;p,q). 

Proof:  The  inequality  F(x-,p,q)  >  'fkix)  for  6k  <  x  <  +oo  was  proved 
by  [3].  Now  we  prove  G(x-,6k,r,s)  <  'fkix)  for  6k  <  x  <  +oo  in  a  similar 
way. 

Let  g(x)  =  G(x-,6k,r,s)  —  ifkix).  It  suffices  to  prove  g{x)  <  0  for  6k  < 
X  <  +00.  To  this  end,  set 


k 

h(x)  =  g(x)  -  x). 

j=i 


(10) 


Then 


h{x)  =  r'[[{6j-x)  +  {s-Cl)'[[{6j-x) 

j=i  j=i 

+ X] n  “  ®)’  (11) 

8  =  1  j  =  lj+8' 

is  a  polynomial  of  degree  k.  It  is  easy  to  verify  that  h(y)  =  0  and  h'(y)  =  0, 
so  we  have 

k-2 

h{x)  =  fi{x-yfW{x--ij),  (12) 

j=i 
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for  some  f3,  7^.  (In  fact,  (3  =  (  — I)^r,  from  (11).) 

We  claim  that  7j  <  Sk  for  1  <  j  <  k  —  2.  The  zeros  of  h(x)  are  the  zeros 
of  g(x).  There  is  a  zero  of  g(x)  in  each  interval  (Sj,  ^j+i)  for  j  =  1,  ■  ■  ■  ,k  —  2. 
Therefore  h(x)  has  A;  —  2  of  its  zeros  less  than  As  y  >  Sk,  we  have  proved 
our  claim. 

Since  /3  =  (  — l)^r  and  (from  (8))  r  <  0,  so  from  (12)  sign(h(a;))  = 
(  — 1)^“^  when  6k  <  x.  On  the  other  hand,  for  6^  <  x  <  +00,  we  hnd  from 
(10)  that  sign(^(a;))  =  sign  (h(a;))(  — 1)^  =  —1,  which  means  g{x)  <0.  ■ 

2.1.2  Two  Ways  to  Rationally  Interpolate  (jjkix) 

Similarly,  we  can  perform  osculatory  interpolations  of  (jjkix)  by  F(x-,p,q) 
and  by  G(x-,  r,  s),  respectively,  such  that 

F{y;  P,  q)  =  F\y;  p,  q)  =  <(>(,(y); 

G{y,  h,  r,  s)  =  4>k{y),  G'{y,  4,  r,  s)  =  (t>'k{y)- 
It  is  easily  verihed  that 


p  =  y  +  My)/(l^kiy) 

(13) 

.  n  72 

(14) 

q  =  Myf/ykiy)  >  o, 

(15) 

r  =  fkiy)  -  {h+i  -  yWkiy) 

(16) 

(17) 

s  =  (4+1  -  yffkiy)  >  0. 

(18) 

Moreover  C|+i  <  s  <  YJj=k+i  k  <  <  r  <  fk+G 

and  if  A;  =  ra  —  1 , 

r  =  0. 

F  and  G  are  identically  equal  to  (jjkix)  \i  k  =  n  —  1. 

Proposition  2  y  <  6k-\-i  <  p  <  6^. 

Proposition  2  indicates  the  same  phenomenon  as  we  observed  in  Propo¬ 
sition  1. 

Theorem  2  If  k  <  n  —  2,  then  for  y  fz  x  <  6^+1 

F(x-,p,q)  <  fkix)  <  G{x;6k+i,r,s). 
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2.2  Solving  f{x)  =  0 

Three  ways  come  to  mind  to  find  the  A;th  zero  of  the  secular  function  (2) 
by  combining  different  rational  osculatory  interpolating  functions  to  'ipkix) 
and  (jjkix)  studied  above. 

The  first  one  we  are  going  to  study  is  the  one  in  [3],  and  we  describe  it 
as  Approaching  from  the  Left  because  the  algorithm  will  produce  a  sequence 
of  monotonically  increasing  approximations  to  provided  the  initial  guess 
is  between  6k  and  A^.  We  reproduce  it  here  for  completeness. 

We  describe  the  second  one  as  Approaching  from  the  Right  for  the  same 
reason. 

The  third  iteration  scheme  is  a  new  and  fastest  one  which  we  shall  call 
the  Middle  Way. 

Assume,  we  have  a  guess  y  to  A^  (with  6k  <  y  <  ^fc+i);  Later,  in  Sec¬ 
tion  4,  we  shall  discuss  how  to  choose  such  a  y. 


2.2.1  Approaching  from  the  Left 


Assume  for  the  moment  1  <  A;  <  ra. 

To  approach  from  the  left,  we  interpolate  f’kix)  by  F(x-,p,g)  and  fkix) 
by  ^(a;;  r,  s),  where  p,  g,  r  and  s  are  determined  by  Equations  (6),  (7), 
(16)  and  (18).  Instead  of  solving  p  -\-  f’kix)  +  fkix)  =  0,  we  solve 


P  + 


q 

P  —  X 


s 


=  0 


(19) 


for  X.  Equation  (19)  has  two  roots  x  of  which  just  one  lies  between  p  and 
6k-\-i',  that  one  is  our  new  approximation  y  +  p.  Set  =  6k-\-i  —  y.  Then 

the  correction  77  to  g  is 


where 


p 


a  —  a^  —  Abe 

¥c 

2b 

a  a^  —  45c 


if  a  <  0, 
if  a  >  0, 


(20) 

(21) 


a 


b 

c 


f{y)  A/,+1 


ii— 


^k+ifiv) 


'>Pk{y) 


i’'kiyy 

p+r  =  p  +  fkiy)  -  ^k+if'k{y)- 


4>'k{y) 

y'kiy) 
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Eventually,  as  f(y)  0,  we  find  a  >  0,  so  (21)  will  be  used  more  often  than 

(20). 

The  case  for  k  =  n  is  special  since  =  0.  Solving  p  +  =  0  gives 


Tf  =  X  —  y  = 


P  +  '>Pn{y) 
pi^niy) 


'>Pn{y) 


f{y)'>Pn{y) 

f'iy)p 


Regarding  the  foregoing  scheme,  we  have 


(22) 


Theorem  3  If  H  <  y  <  then  r]  >  0  and  y  <  y  +  p  <  A^;  if 
I'k  <  y  <  H+i,  then  p  <  0  and  y  +  p  <  Xk  <  y. 

Proof:  It  is  clear  from  Theorems  1  and  2.  ■ 

Theorem  3  says,  starting  with  an  initial  guess  which  is  less  than  Xk,  the 
foregoing  scheme  will  yield  a  seguence  of  approximations  converging  mono- 
tonically  upwards  to  X^;  on  the  other  hand  if  the  initial  guess  exceeds  Xk  the 
first  increment  p  will  be  negative  enough  to  make  the  next  approximation 
less  than  Xk,  and  all  later  approximations  will  go  upwards  to  Xk  again. 

Remark:  The  above  statement  is  true,  generally,  except  for  one  case 
when  the  initial  guess  exceeds  Xk  so  much  that  the  next  approximation  falls 
below  Sk-  Recall  that  we  can  only  guarantee  p  <  y  +  p  <  Sk+i  while  p  <  hk 
by  Proposition  1.  The  author  has  encountered  examples  in  which,  because 
of  inaccurate  initial  guess  y,  the  next  approximation  by  Approaching  from 
the  Left  went  indeed  below  hk-  Care  must  be  taken  to  prevent  that  from 
happening. 


2.2.2  Approaching  from  the  Right 

Reversing  to  the  choice  of  interpolating  rationals  for  f’kix)  and  fkix)  in 
Approaching  from  the  Left,  we  now  interpolate  f’kix)  by  G(x-,Sk,r,s)  and 
4>k{x)  by  F{x-,p,g),  where  p,  g,  r  and  s  are  determined  by  Equations  (8), 
(9),  (13)  and  (15).  Set  Ak  =  Sk  —  y.  Then  the  correction  p  to  y  is,  again, 
given  by  (20)  and  (21),  but  with 


a 

b 

c 


fiy) 

Afc/(y) 


j  -  Akfkiy) 
\  fkiy)) 

My) 


MM 

p  +  r  =  p  +  fjkiy)  -  AkMy)- 


My) 

My) 
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From  which  we  see  eventually  a  >  0. 

When  k  =  n,  we  partition  and  interpolate  the  secular  function  f(x)  the 
same  as  we  did  for  the  case  k  =  ra  —  1,  but  choose  our  new  approximation  to 
Xn  to  be  the  root  of  p  +  G(x-,  x,  s)  +  F(x-,p,  g)  =  0  which  lies  between 
Sn  and  6n+i  if  any.  It  is  easy  to  see  that  F(x-,p,q)  =  (^n-iix),  and  there  is 
such  a  root  provided  p  +  r  >  0.  The  correction  p  to  p  is 


a  +  W  —  Abe 

V  =  -  if  «>0, 


2c 

2b 


(23) 


if  a  <  0, 


a  —  —  Abe 

with  a,  b,  c  as  shown  above  for  k  =  n  —  1,  which  can  be  simplihed  as 


a  =  (A„_i  +  ^n)f{x)  -  A„_iA„/'(p), 

b  =  An-iAnf(y), 

c  =  p  +  r  =  p  + 'ipn-iiy)  -  An-i'ip'n_i{y) 

=  fiy)  -  An-ii^'^_iiy)  - 

>2 

(23)  is  usable  provided  f{y)  —  An-i'4’'n-i{y)  ~  a  ^  ®  since  otherwise  yFp  < 

>2 

Sn  or  p  is  inhnite.  Eventually  as  f{y)  0,  so  does  f{y)  —  An-i 
become  positive. 

The  foregoing  scheme  is  called  Approaching  from  the  Right  because  of 
this: 


Theorem  4  If  Xk  <  y  <  then  p  <  0  and  Xk  <  y  +  p  <  y.  If 

H  <  y  <  (ind  k  <  n,  then  p  >  0  and  y  <  Xk  <  y  +  p.  If  bn  <  y  <  Xn  and 
y  is  not  too  close  to  bn,  then  y  F  p  >  Xn- 

Proof:  It  is  clear  from  Theorems  1  and  2.  ■ 

Remark:  As  we  remarked  after  Theorem  3,  in  Theorem  4  it  is  possible 
for  y  F  p  to  jump  above  b^^i  when  bk  <  y  <  Xk-  This  must  be  prevented  in 
practice. 


2.2.3  The  Middle  Way:  a  New  Method 

A  new  way  (we  call  it  the  Middle  Way)  to  solve  the  secular  equation  f{x)  =  0 
will  be  developed  here.  Unlike  the  previous  two  ways,  it  need  not  yield 
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a  sequence  of  approximations  that  converges  monotonically  to  A^,  but  it 
converges  faster  as  we  will  see. 

We  interpolate  both  'ipkix)  and  (jjkix)  by  rationals  of  type  G(x-,S,r,s), 
taking  both  nearby  poles  into  consideration.  To  be  specihc,  when  0  <  k  <  n 
we  let 


x  +  ^  approximate  to  ^fc(a;),and 

c 

R  +  — - — —  approximate  to  (f)k(x)^ 

where 

I  ^  >0,  and  I  ^  "  ^i+i</>k(y)  >  o, 

\  r  =  ijkiy)  -  ^ki^'kiy)  <0,  1  ^  ~  ^k+i(l)'kiy)  >  o, 

(24) 

and  Afc  =  Sk  —  y  <  0  <  —  y  as  before.  We  compute  our  new 

“better”  approximation  y  +  77  to  by  solving  the  equation 


h  -  X 


R 


S 


^k+l  X 


=  0. 


(25) 


Equation  (25)  has  two  roots  a;,  one  of  them  inhnite  if  p  +  r  +  i?  =  0.  The 
root  we  need  is  the  one  between  6k  and  6k-\-i.  Set  rj  =  x  —  y.  Then  the 
correction  r]  to  y  has  forms  (20)  and  (21)  with 


a  =  (Ak  +  Ak+i)fiy)  -  AkAk+ifiy), 
b  =  AkAk+if(y), 

c  =  fiv)  -  ^ki’kiy)  -  ^k+i(t>'kiy)- 

Eventually  a  >  0. 

The  case  k  =  n  is  exactly  the  same  as  in  §  2.2.2. 

Remark:  Quadratic  convergence  of  Approaching  from  the  left  was  proved 
in  [3].  A  similar  proof  does  likewise  for  Approaching  from  the  Rights  and 
proves  that  the  Middle  Way  converges  at  least  quadratically. 


3  Secular  Equation  Solvers  (II):  A  Close  Look 

The  Middle  Way  outperforms  Approaching  from  the  Left  and  Approaching 
from  the  Right  because  it  takes  both  nearby  poles  into  considerations  while 
each  of  latter  two  uses  one  of  the  poles.  However,  as  we  shall  see,  the  Middle 
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Way  still  behaves  badly  in  a  few  circumstances.  The  following  is  a  simple 
intuitive  explanation  for  why.  In  Cj  i®  the  weight  over  the  pole  Sj  and 

controls  how  much  role  the  pole  Sj  plays  in  the  secular  function  f(x).  Most 
of  the  time  the  simple  interpolating  rational  r  +  approximates  'ipkix) 

very  well,  but  there  are  situations  when  s  overestimates  Q,  so  much  that 

s-C 

iterations  are  forced  to  move  slowly  towards  the  desired  roots.  Recall 
functions  for  the  rest  of  the  poles  in  ipkix).  Similar  things  happen  to  too. 
In  this  section,  we  shall  provide  ways  to  conquer  this  as  well  as  difficulties 
when  just  employing  two  nearby  poles  is  not  enough. 

The  case  k  =  n  will  not  be  handled  here  since  the  Middle  Way  {Ap¬ 
proaching  from  the  Right  also)  has  used  the  two  nearby  poles  on  the  left  of 
Xn  already  while  there  is  no  pole  on  its  right.  We  begin  with  a  new  look  at 
our  iteration  formulas  whose  flexibility  will  be  of  big  help. 


3.1  Iteration  Formulas 


Let  y  be  a  hxed  approximation  to  somewhere  between  Sk  and  The 

Middle  Way  is,  eventually,  based  upon  an  osculatory  interpolation  of  f{x) 
at  y  by 

Q(a;;c,s,5)  =''c+ — ^ — +- — - ,  (26) 

Sk-  X  Sk+i  -  X 

for  which 


c  + 


s 

h  -  y 


s 


S 

h+i  -  y 

S 


(h  -  yf  (4+1  -  yf 


f{y)^ 

f'iy)- 


(27) 

(28) 


However,  if  we  start  with  Q{x-,  c,  s,  S)  satisfying  (27)  and  (28),  we  cannot 
determine  Q{x-,  c,  s,  S)  uniquely  because  of  three  unknowns  and  only  two 
equations  available.  Therefore  an  additional  condition  has  to  be  imposed  in 
order  to  determine  Q{x-,  c,  s,  S).  For  the  moment,  let’s  not  worry  about  this, 
but  instead  assume  Q{x-,  c,  s,  S)  is  a  rational  of  form  (26)  with  (27)  and  (28) 
satished. 

The  idea  for  computing  a  correction  g  to  y  for  the  next  (“better”)  ap¬ 
proximation  y  g  to  Xk  is  to  solve  the  equation  Q{x-,  r,  s,  S)  =  0.  Let 


Ak  =  Sk  -  y,  Ak+i  =  4+1  -  y,  X  =  y  +  g. 
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Q{x-,  c,  s,  5)  =  0  yields 


c(Afc  -  ?7)(Afc+i  -  77)  +  s(Ak+i  -v)  +  S{^k  -  7?)  =  0, 


giving 

CT]^  —  77[c(A^:  +  A^:_l_i)  +  s  +  5]  +  cAkAk^i  +  sA^:_l_i  +  SA^  =  0.  (29) 

Proposition  3  With  c,  s  and  S  subject  to  (27)  and  (28),  we  have 

c(Afc  +  Afc_|_i)  +  5  +  5  =  (Afc  +  Ak-\-i)f{y)  —  AkAk-\-if'{y), 
CA}^A}^-\.\  -SA/j;_|_X  “1“  S  A},  -  A/j;  A/j;_|_X  y(7/)  . 


Proof:  It  follows  from  (27)  and  (28)  that 

•5  =  .  \ - {f{y)-c-  Ak+if'{y)), 

^  (J{y)-e-Akf{y)). 

^fc+i  -  k\k 

Substituting  them  into  c(Ak  +  A^+i)  +  5+5  and  cA^Afe+i  +  sA^+i  +  SA^ 
leads  to  the  desired  results.  ■ 

Proposition  3  illustrates  a  surprising  fact  that  the  iteration  formula  by 
solving  (29)  depends  only  upon  c,  alone.  As  surely,  the  equations  (27)  and 

(28)  are  solvable  for  any  c.  Without  worrying  about  global  convergence, 
any  c  will  give  rise  an  ultimately  at  least  quadratically  convergent  iteration 
scheme! 

In  the  case  s  >  0  and  5  >  0  in  which  we  are  interested,  it  follows  from 

(29)  that 


where 


a  —  \/  (j?  —  4:bc 

7]  =  -  if  a  <  0, 

'  2c  -  ’ 

(30) 

25 

=  -  :  il  a  >  0, 

a  +  V  —  45c 

(31) 

=  i^k  +  Ak+i)f(y)  -  AkAk+if'(y), 

=  AkAk+if(y). 

In  what  follows,  for  a  particular  iteration  scheme,  we  shall  provide  the  nec¬ 
essary  number  c,  as  well  as  a  proof  for  s  >  0  and  5  >  0.  Different  choices 
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of  c  give  rise  to  different  iteration  schemes.  By  choosing  c  properly,  a  very 
efficient  iteration  scheme  may  be  obtained. 

The  Middle  Way  falls  into  the  category,  i.e.,  s  >  0  and  5  >  0,  and 

c  =  f{y)  -  ^ki’kiy)  -  ^k+i(t>'kiy) 

=  f{y)- ^k+if'{y)- i^'k{y){h- h+i) 

=  f(y)  -  Akf{y)  -  ct>'k{y)(Sk+i  -  Sk). 

Reorganization  of  c  is  for  the  reader  to  see  its  differences  from  those  in  some 
other  schemes  to  which  we  are  about  to  get. 


3.2  The  Fixed  Weight  Method 

We  have  been  aware  that  the  weight  or  may  be  overestimated  in 
the  Middle  Way.  Although  not  all  overestimations  are  harmful,  there  are 
cases  where  iterations  move  slowly  because  of  overestimations.  In  order  to 
handle  these  cases  efficiently,  we  propose  the  following  scheme  so-called  the 
Fixed  Weight  Method  because  it  hxes  one  of  the  weights  ('I  and  while 
satisfying  (27)  and  (28). 

1.  The  Case  Xk  closer  to  Sk:  We  set  s  =  then 


s 

5 


c 


cl 


Sk 


A^+i  fiy)  -  ^ 


Chi 


Afc+l  ^2 


^  A2 
Sk 


hCi  > 

f(y)-Ak+if(y)-^(Sk-Sk+i). 


(32) 

(33) 


(34) 


2.  The  Case  Xk  closer  to  Sk-\-i:  We  set  S  =  Ck-\-i^  then 


5 


(ny)  -  ^ 

V  ^k+i 

Ck+  T  'CCj  >  cl 

j^k,k+l  i 


(35) 
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(36) 

(37) 


^  =  Cl+i, 

c  =  /(^)  -  Afc/'(y)  -  ^(4+1  -  4). 

^k+i 

The  following  theorem  gives  a  basic  property  of  the  interpolating  function 
Q(x-,  c,  s,  S)  and  the  next  approximation  y  +  r]  to  A^. 

Theorem  5  If  c,  s  and  S  are  defined  by  (32),  (33)  and  (34),  then  f(x)  < 
Q(x-,  c,  s,  S)  for  Sk  <  X  <  and  either  Sk<y<y  +  i]<Xk<  or 
H<y-I'n<^k<y<  H+i;  tf,  on  the  other  hand,  c,  s  and  S  are  defined  by 
(35),  (36)  and  (37),  then  f(x)  >  Q(x-,  c,  s,  S)  for  Sk  <  x  <  and  either 
h<>^k<y  +  'n<y<  h+i  or  Sk  <  y  <  Xk  <  y  +  -n  <  Sk+1 . 

3.3  Gragg’s  Scheme,  a  Scheme  of  Ultimately  Cubic  Conver¬ 
gence 

Gragg  proposed  to  choose  c,  s  and  S  so  that  Q{x-,c,s,S)  matches  f{x)  at 
y  up  to  the  second  derivative.  In  another  word,  besides  (27)  and  (28),  it  is 
also  required  that 


5 


fiy) 


{h  -  yf  (4+1  -  yf 

which,  together  with  (27)  and  (28),  yield 

ffiv)  !"{y) 


s  = 


-  Cfc  + 


Afc  —  \Ak  +  l 

ih  -  yf 


4+1  >2  ^  a2 

/ 1  ,.\.s  8*  ^  sfc  7 


-  4+1  (Sy  -  y  f 


S  = 


A^A|^  ffiy)  riy) 


(4+1  -  yf 


-  4+1  + 


E 


^k+i-h 


Ci  >  Ck-\-l^ 


C  = 


fiy)  -  (Ak  +  Ak+i)fiy)  + 


(38) 


This  scheme  needs  to  compute  the  second  derivative  of  the  secular  function 
fix).  Thus  it  needs  more  work. 
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Remark:  Gragg’s  scheme  will  yield  a  sequence  of  approximations  which 
converge  monotonically  to  the  desired  eigenvalue  for  1  <  A;  <  ra  (ref. 
[8]  and  notice  the  difference  between  their  secular  function  and  ours).  The 
interpolation  of  f(x)  for  hnding  A„  should  be  done  in  the  same  way  as  for 
hnding  A„_i,  except  monotonic  convergence  is  lost. 


3.4  Using  Three  Poles  When  Necessary:  a  Hybrid  Scheme 


The  Middle  Way  and  the  Fixed  Weight  Method  can  be  combined  to  design 
more  powerful  secular  equation  solvers  by  properly  switching  between  the 
two.  But  there  are  cases  where  three  poles  have  to  be  used  in  order  to  make 
iterations  go  faster.  For  m  =  2,  •  •  • ,  ra  —  1,  set 

n  y2 

f'm{x)  =  pF  Y. 

j  =  l,j^ra  ^ 


which  is  the  secular  function  f{x)  with  the  mth  term  in  the  summation 
removed.  It  is  easy  to  see  that  fm{x)  has  a  zero  between  hm-i  and  hm+i- 
Numerically,  we  have  discovered  the  following  cases  are  possible  difficult 
ones: 

1.  Sk  <  Xk  <  and  fk{x)  has  a  zero  between  6k  and  A^; 


2.  '^'=+^'=+1  <  Afc  <  6k+i  and  fk+i{x)  has  a  zero  between  A^  and  Sk+i- 

To  see  why  the  hrst  case  may  be  difficult,  we  let  0,  then,  as  functions 

of  Afc_i  goes  monotonically  upward  until  it  hits  Sk  while  Xk  goes  mono¬ 
tonically  downward  until  it  hits  the  zero  of  fkix)  between  6k  and  Xk-  Now, 
on  the  contrary,  let  go  back  from  zero  to  its  original  value,  what  we  will 
see  is  the  exact  contrary  phenomena.  Based  on  this  simple  observation,  we 
can  think,  roughly,  that  Afc_i  depends  largely  on  the  pole  6k  and  its  weight 
d  and,  therefore,  the  Fixed  Weight  Method  should  be  good  at  hnding  it. 
On  the  other  hand,  Xk  depends  on  the  pole  6k  and  its  weight  Q.  and  the 
zero  of  fkix)  where  it  starts  from  and  which  is  controlled  roughly  by  the  the 
poles  6k-i  and  6k-\-i  with  appropriate  weights.  Similar  intuitive  arguments 
apply  to  the  second  case  above. 

The  treatment  of  the  two  cases  is  almost  identical.  In  what  follows, 
we  discuss  the  hrst  case  only.  A  natural  way  to  handle  the  hrst  case  is  to 
interpolate  /(x)  with  the  following  simple  rational: 


Qix]  c,  s,  N)  c 


c 


^k—i  X  6k  X  6k-\-i  X 


(39) 
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The  parameters  c,  s  and  S  can  be  determined  by  either  interpolating  fkix)  = 
p  +  ipk-iix)  +  i’kix)  in  the  way  that  the  Middle  Way  does  or  in  the  way 
that  the  Fixed  Weight  Method  does  depending  on  situations.  Once  we  have 
(39),  its  zero  between  6k  and  6k-\-i  can  be  computed  in  various  iterative 
ways  with  negligible  cost.  However,  care  has  to  be  taken  in  evaluating 
Q(x-,  c,  s,  S)  at  a  given  point.  As  a  matter  of  fact,  because  of  roundoff, 
sometimes  (though  rarely)  computed  Q(y,  c,  s,  S)  differs  signihcantly  from 
computed  f(y)  though  the  two  should  be  the  same  by  interpolation  in  the¬ 
ory.  It  turns  out  that  we  can  evaluate  Q(x-,  c,  s,  S)  in  an  indirectly  way  to 
avoid  this  from  happening.  Since  cimputed  f(y)  is  available  at  the  time  we 
interpolate  the  secular  function  f(x),  we  do  not  compute  Q(y,  c,  s,  S)  at  all 
while  simply  setting  it  to  be  f(y).  At  the  very  next  time  when  we  need  to 
compute  Q(x-,  c,  s,  S)  we  update  f(y)  by  adding  a  correction  to  it  because 


Q{x-,  c,  s,  S) 


QiVi  c,  s,  S)  +  (Q(x;  c,  s,  S)  -  Q{y;  c,  s,  5)) 

((4-. 


C 


2 

k 


(h  -y)-{x-y) 


_ ^ _ ), 

{h+i  -y)-{x-y) 


Subsequent  evaluation  of  Q(x-,  c,  s,  S)  is  done  in  the  same  way  by  adding 
correction  to  its  value  at  the  previous  x. 

According  to  our  experience,  proper  treatment  of  poles  and  their  weights 
is  crucial  and  delicate,  especially  in  difficult  cases  as  we  discussed  above.  It 
accelerates  convergence  signihcantly  in  the  sense  that  it  reduces  the  over¬ 
all  average  number  of  iterations  per  eigenvalue  hnding  and  keeps  the  peak 
number  of  iterations  for  hnding  an  eigenvalue  reasonably  small,  which  is 
essential  for  avoiding  load-balancing  in  parallel  computations.  In  view  of 
this,  we  propose  the  following  scheme-the  Hybrid  Scheme  which  combines 
the  Middle  Way  and  the  Fixed  Weight  Method  in  a  clever  way  and  which  of 
course  employs  three  poles  when  necessary:  Suppose  we  are  computing 
and  suppose  6k  <  Xk  <  (6k  +  6k-\-i  )/2.  In  practice,  we  compute  Xk  —  H  instead 
of  Xk  itself.  We  have  an  initial  guess  6k +  t  for  Xk  for  which  6k  <  6k  +  T  <  Xk 
is  guaranteed  (ref.  Section  4).  The  value  of  the  secular  function  is  evaluated 
in  such  a  way 

fi^k  +  )  =  fki^k  +  )  H - —  <  0 

—  T 
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that  fk{6k  +  t)  is  obtained  as  a  by-product.  At  this  point,  we  will  make  a 
decision  between  using  two  poles  or  three  poles: 


if  fk(Sk  r)  >  0,  then 

two  poles  Sk  and  are  used; 

else 

three  poles  ,  Sk  and  are  used, 

end  if 


After  the  decision  is  made,  we  use  the  Fixed  Weight  Method  to  interpolate 
fix)  if  the  decision  favors  two  poles  or  interpolate  fkix)  if  the  decision 
favors  three  poles,  and  do  one  iteration  to  get  an  new  approximation  ti. 
If  +  "^1)  <  0  and  \f(Sk  +  ri)|  >  0.1  X  \f(Sk  +  r)|,  we  switch  to  the 
Middle  Way  for  the  next  iteration  starting  at  6k-\-Ti.  From  now  on,  just  for 
guarding,  we  compare  the  value  /new  of  the  secular  function  at  the  newest 
approximation  with  its  value  /pj-e  at  the  previous  one  and  another  switch  is 
made  from  the  current  iterative  scheme  to  the  other  one  if  /new/pre  >  0  and 
I  /new  I  >  0 . 1  X  I  /pre  |  • 

The  philosophy  behind  our  switch  making  is  that,  taking  6k  <  + 

T  <  \k  <  for  example  where  r  is  an  initial  guess,  after  hrst 

interpolation  using  the  Fixed  Weight  Method  6k  F  t  <  6k  F  ti  <  A^.  Now 
if  \f{^k  F  'Ti)!  >  0.1  X  \f{6k  F  "t)!,  it  indicates  that  iteration  is  going  slow 
with  the  Fixed  Weight  Method.  Generally,  it  is  always  a  danger  when  the 
values  of  f{x)  at  two  consecutive  approximations  have  the  same  sign  but 
the  lastest  value  improves  little  in  comparing  with  the  previous  one. 

If,  however,  we  are  computing  under  {6k  F  ^fc+i)/2  <  Xk  <  ^fc+i, 
similar  principle  can  be  applied  in  a  straightforward  way. 

Remark:  A  theorem  similar  to  Theorem  5  holds  for  Q{x-,  c,  s,  S)  if  we  use 
the  Fixed  Weight  Method  to  interpolate  fkix)  or  fk-\-i{x). 

4  Initial  Guesses 

An  iteration  that  could  ultimately  converge  quadratically,  which  is  quite 
fast,  may  get  converge  slowly  (if  at  all)  from  a  sufficiently  bad  hrst  guess. 
This  sad  possibility  becomes  a  probability  when  (k  or  /fc+i  is  tiny  compared 
with  the  other  /j’s,  in  which  case  Xk  is  very  close  to  6k  or  In  extreme 
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cases,  as  we  have  already  observed,  the  iterations  Approaching  from  the  left 
or  Right  can  even  jump  out  from  between  Sk  and 

In  what  follows  we  will  present  an  inexpensive  way  to  obtain  relatively 
accurate  initial  guesses.  At  the  same  time  we  shall  show  how  to  decide 
which  of  Sk  and  is  closes  to  A^.  This  decision  is  important  because,  if 
decided  wrongly,  roundoff  could  cause  an  iterate  to  collide  with  an  endpoint 
Sk  or  or  even  overshoot  it.  A  collision  would  soon  lead  to  Division 

by  Zero]  overshooting  could  jeopardize  subsequent  convergence.  To  avoid 
these  problems,  we  shall  translate  the  origin  temporarily,  while  is  being 
computed,  to  whichever  of  Sk  or  is  closer. 

A  natural  and  effective  way  to  make  the  correct  decision  is  to  look  at 
the  sign  of  /  jf  this  value  is  positive,  we  know  A^  is  closer  to  6k 

than  to  6k-\-i  and  thus  the  origin  should  be  translated  to  6 k]  otherwise  A^ 
is  closer  to  Sk+i  and  the  origin  should  be  translated  to  there.  If,  however, 
roundoff  obscures  the  sign  of  /  in  which  case  A^  is  almost  half 

way  between  the  two  poles,  the  origin  could  be  translated  to  either  one  of 
them  without  making  much  difference.  The  computation  of  /  can 

be  done  in  such  a  way  that  an  initial  guess  y  is  obtained  as  a  by-product. 

First  we  consider  the  case  1  <  A;  <  ra. 

Rewrite  the  secular  function  (2)  as  f{x)  =  g{x)  -\-  h(x),  where 


g(x)  =  p 


6j-X 


and  h(x)  = 


^k 


c 


k+1 


6k  X  6k-\-\  X 


(40) 


We  choose  our  initial  guess  y  to  be  that  one  of  the  two  roots  of  the  equation 

=  (41) 


lying  between  6],  and  where  g  -^^as  retained  from  the  com¬ 
putation  of  /  (Thus  it  is  a  gift!)  By  sketching  the  graph  of  the 

sum  of  the  last  two  terms  on  the  left  hand-side  of  (41),  we  can  tell  without 
any  difficulty  which  root  is  needed.  In  case  /  >  q.  Equation  (41) 

should  be  solved  for  t  =  y  —  6^,  while  in  case  /  <  Q,  it  should  be 

solved  for  t  =  y  —  Dehne  A  =  6k-\-i  —  6k  and  c  =  g  Here 

are  the  formulas  for  r: 


T  =  y  -  6  k 


a  —  —  Abe 

Ye 


if  a  <  0, 


(42) 
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=  -  :  if  a  >  0, 

a  -\-  y  a?  —  Ahc 

where  if  /  >  0, 

K  =  A;,  a  =  cA  +  {(I  +  Cl+i)?  b  =  (IA,  (43) 

and  if  /  <  0, 

K  =  k  +  1,  a  =  -cA  +  (C,l  +  Cl+i)?  b  =  (44) 


Theorem  6  If  f  >  q,  then  T  given  by  (42)  and  (43)  satisfies 

s  ^  s  I  ^  \  bk-\-i  ^ 

Vk  <  Vk  +  T  <  Ak  <  - - - J 

If  f  ("^4+1*+!)  <  0,  then  T  given  by  (42)  and  (44)  satisfies 


bk  +  bk+i 


<  <  bk+i  +  T  <  Sk+i- 


Proof:  One  way  to  see  these  is  to  sketch  graphs.  It  can  also  be  seen  by 
subtracting  one  from  the  other  of  the  following  two  equations 


Cl 


C 


fc+i 


bk  ^k 


bk+1  —  ^k 


Cl 


c 


fc+1 


-^(Afc), 

bk  +  bk+i 


bk  -  y  bk+1  -  y 


=  -g 


which  produces 

(Afc  -  y) 


42 

^k 


Cl 


fc+i 


{bk  -  ^k){bk  -  y)  (bk+i  -  ^k){bk+i  -  y) 


bk  +  bk+1 


-  Ai 


C? 


y  _ 

{b,  -  (S,  -  \k) 


(45) 


Thus  \k  —  y  has  the  same  sign  as  has  ^k+Ski-i  _  g 

For  the  case  of  A;  =  ra,  we  partition  the  secular  function  f{x)  =  g{x)-\-h{x) 
as  when  k  =  n  —  1.  An  initial  guess  y  is  obtained  basically  by  solving 

+  '  %)  =  0. 


A  detail  but  simple  analysis  yields  the  following  formula  for  t  =  y  —  by. 


19 


•  The  case:  <  A„,  i.e.,  /  <  0. 

1.  If  g  (^^2i±|2±L^  <  -h{8n+i),  then  t  =  y  -  =  z^z/p. 

2.  >  -h(^„+i),then 


(46) 


(47) 


It  can  be  proved  that  in  this  case  <  A„  <  +  r  <  ^n+i  • 

•  The  case:  ^ii±|2i+i  >  A„,  i.e.,  /  >  0.  Then  g  > 

—  h(^„+i).  We  compute  r  =  y  — by  (46)  and  (47).  It  is  easy  to  show 
that  in  this  case  +  r  <  A„  <  bdyi±L_ 


The  approach  to  compute  initial  guesses  we  just  proposed  costs  marginal 
since  to  make  program  robust  we  have  to  calculate  /  anyway. 

An  alternative  approach  proposed  by  [3]  is  to  solve  the  equation  (41)  with 
g  replaced  by  g(6k+i)  or  g(Sk)^  which  yields  two  guesses  that  hap¬ 

pen  to  be  a  lower  bound  and  an  upper  bound  for  A^.  First  of  all  their 
guesses  require  extra  work.  Second,  this  extra  work  may  not  be  worth  do¬ 
ing.  Take  the  case  A^  <  for  example.  In  this  case,  this  lower  bound 

is  worse  than  our  guess  while  the  upper  bound  may  possibly  be  greater 
than  Jn  a  divide- and- conquer  code  from  Sorensen  and  Tang^,  they 

simply  solve  the  equation  (41)  with  g  replaced  by  p  to  obtain  ini¬ 

tial  guesses  for  1  <  A;  <  ra  —  1.  They  handle  the  case  A;  =  ra  by  solving 
p  +  'tpn-ii^n+i)  T  (l^n-iiy)  =  0.  lu  auy  event,  we  believe  our  approach  should 
be  better  averagely.  Our  numerical  experience  conhrms  our  intuition. 

would  like  to  thank  Prof.  D.  Sorensen  and  Dr.  Peter  Tang  for  sending  me  their 
divide  and  conquer  code. 


20 


5  Two  Stopping  Criteria 


People  have  discovered  in  order  to  guarantee  the  computed  eigenvectors  by 
(3)  to  be  fully  orthogonal  one  must  be  able  to  compute  the  distances  between 
each  Xi  and  Sj  to  almost  full  accuracy  [10,  11].  So  simulated  “double”  double 
precision  was  invented  to  evaluate  the  value  of  the  secular  function  extra 
precisely  when  necessary.  To  guarantee  full  accuracy  of  computed  distances 
Si  —  Aj,  the  stopping  criterion  was  set  to  be 

\t]\  <  ce.m  mm{|4  -  x\, \Sk+i  -  x\},  (48) 


where  x  is  the  current  iterate,  r]  is  the  last  iterative  correction  that  was  com¬ 
puted,  Cm  is  the  machine’s  roundoff  threshold  and  c  is  fairly  small  constant 
[1,  5,  11].  (c  =  2“^  in  Sorensen  and  Tang’s  code.)  However  we  decide  to  stop 
when 

<  e-m  min{|4  -  x\, \Sk+i  -  a;|}(|?7o|  -  Ihl),  (49) 

where  r]o  is  the  correction  before  last.  This  new  stopping  criterion  often 
save  one  iteration  per  eigenvalue  found.  The  following  observation  due  to 
W.  Kahan  justihes  (49). 

Let  {xj}JLi  be  a  sequence  of  numbers,  produced  by  some  rapidly  conver¬ 
gent  iteration  scheme,  such  that  lim  Xj  =  z.  If  —  Xj\l\xj  —  Xj-i\  are 

j^oo 

decreasing  for  j  >  k,  and  if\xk^i  —  Xk\/\xk  —  <  1,  then 


\xk+i  -  z\  < 


\xk+i  - 


To  see  why  this  observation  works,  let’s  denote  7  =  \xk-\-i  —  — 

Xk-i\-  For  i  >  k,we  have 


n 


■(®fc  i)' 


which  gives  —  Xi\  <  7®  ^~^^\xk  —  a;fc-i|.  Therefore 


\xk+i  -  zj  = 


(x^  -  x,  +  i) 


<  \xk  -  Xk-i\  ^  7 
i=k-\-l 


i  —  k-\-l  _ 


^k\ 


\'^k  — 1|  '^k\ 
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as  required. 

Recently,  Gu  and  Eisenstadt  [9]  found  a  new  way  to  compute  eigenvectors 
after  all  eigenvalues  are  computed.  This  new  way  makes  the  simulated 
“double”  double  precision  unnecessary,  and  the  stopping  criterion  was  set 
to  be 

\fix)\  <nem  \P  +  J2  7“^ 

by  [9].  However,  for  large  ra,  it  may  allow  too  much  error  while  for  small  n 
there  is  a  danger  that  it  might  never  be  satished.  Note  that  not  every  term 
in  secular  equation  is  of  the  same  magnitude.  Generally,  only  two  terms 
will  dominate  all  others.  In  our  code,  we  decide  to  compute  error  bounds 
with  marginal  costs  at  each  iteration.  Our  implementation  is  as  follows:  We 
compute  'ipkix)  by  summing  up  each  term  from  j  =  1  to  A;,  while  (jjkix)  from 
j  =  ra  to  A;  +  1  (refer  to  (5))  since  hopefully  in  this  way  we  add  terms  from 
small  magnitude  to  large  ones.  Note  that  we  actually  work  with 

n  a2 

/(^JV  + 't)  =  p  +  ^  TV - (51) 

(G  -  bK)  -  T 

where  t  =  x  —  Sk  and  K  =  A;  if  comes  more  close  to  than  to  other  bj 
and  K  =  k  +  1  otherwise.  It  is  easy  to  show  that 

\f(bK  +  't)  -  (Computed /(^/V  +  t))\  < 

where 

k  t2  k+1  t2 

e  =  2p+^(A;-j  +  6)  ^  +^(i-A;  +  5)  ^  +|/(^/v  +  r)|, 

0,'  —  X  0,'  —  X 

2  =  1  ■'  3=n  ■’ 

which  can  be  computed  recursively  on  our  way  to  compute  ^^(a;)  and  (jjkix) 
and  costs  about  2ra  additions  for  each  iteration  cycle.  On  the  other  hand,  if  r 
is  a  neighbor  of  the  desired  zero  r*  to  the  function  (51),  i.e.,  |r  — r*|  <  Cmli'l, 
then 

\f(bK  +  T)-0\  <  |r  -  r*|  |/'(^/v  +  r)|  +  0(|r  -  r*p) 

<  em|r|  |/'(^/v  +  r)|  +  0(|r  -  r*p). 

In  view  of  these,  we  set  our  stopping  criterion  to  be 

\f(bK  +  i-)!  <  ee„  +  e„|r|  |/'(^/v  -  r)|  (52) 

This  stopping  criterion  is  more  reasonable  and  tighter  than  (50).  The  last 
term  in  (52)  is  usually  much  smaller  than  ee^  since  |r|  <  —  bk\/2. 
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6  Choosing  the  Right  Scheme 


With  many  different  iterative  schemes  at  hand,  which  one  is  the  right  one 
that  we  should  use  in  order  to  achieve  best  performance  that  we  could  get 
from  the  divide- and- conquer  method?  Obviously,  they  could  not  be  equally 
efficient.  Based  on  our  intuition  and  numerical  experience,  we  think  the 
Hybrid  Scheme  is  the  best  choice. 

Approaching  from  the  Left  has  been  used  for  hnding  zeros  of  the  secular 
function  (2)  for  over  a  decade  since  it  hrst  appeared  in  [3].  Why  was  it 
favored  and  why  were  not  the  other  schemes?  Two  possible  reasons  are: 

•  Only  a  small  portion  of  the  total  cost  for  Cuppen’s  divide- and- conquer 
algorithm  is  spent  on  solving  secular  equations^.  Therefore  a  faster 
secular  equation  solver  might  affect  the  overall  performance  of  the 
algorithm  so  little  that  people’s  attention  were  not  caught. 

•  Approaching  from  the  Left  yields  a  sequence  of  approximations  that 
converge  monotonically  to  the  desired  eigenvalue.  Monotonicity  is 
good,  for  example,  no  need  to  compute  error  bound  (refer  to  Section  5), 
but  simply  stop  whenever  monotonicity  is  lost. 

The  rational  interpolation  resulting  in  Approaching  from  the  Left  forces 
to  be  the  pole  of  the  interpolating  rational  G(x-,Sk^i,r,s)  for  <(>fc(a;),  while 
pushes  the  pole  to  the  left  to  p  as  the  pole  of  the  interpolating  rational 
F(x-,p,g)  for  ifkix).  The  former  is  quite  reasonable;  on  the  other  hand, 
the  latter  raises  a  question:  Are  rationals  of  form  F(x-,p,  g)  good  enough  as 
candidates  to  interpolate  ifkix),  especially  for  the  case  when  x  comes  close 
to  ^fc?  The  answer  turns  out  to  be  “No” ,  as  it  becomes  clear  from  numerical 
results  in  the  following  section.  This  gives  us  an  idea  that  Approaching  from 
the  Left  works  better  in  the  case  when  comes  closer  to  than  in  the 
case  when  closer  to  Sk. 

The  above  argument  applies  to  Approaching  from  the  Right.  And  the 
Middle  Way  should  work  perfectly  in  any  case  in  the  sense  that  it  matches 
the  better  one  of  the  two. 

The  average  number  of  iteration  of  Gragg’s  cubic  scheme,  which  needs  to 
compute  the  second  derivative  of  the  secular  function  f(x),  is  about  the  same 
as  that  of  our  Hybrid  Scheme.  But,  as  we  will  see  later,  it  does  iterates  more 

^This  is  not  true  if  only  eigenvalues  are  requested  because,  in  this  case,  the  total  cost 
is  0[rS)  [6]. 
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in  some  cases  where  poles  and  weights  have  to  be  taken  into  considerations 
more  closely. 

Few  cases  have  been  encountered  by  the  author  in  which  there  are  big 
differences  in  numbers  of  iterations  for  hnding  A„.  Our  suggestion  is  strongly 
supported  by  numerical  results  below. 

7  Numerical  Examples 

In  numerical  results  that  follows,  double  precision  arithmetic  was  used  through¬ 
out.  Both  stopping  criteria  will  be  used  for  testing.  We  will  demonstrate 
numerically 

1.  How  good  are  our  initial  guesses? 

2.  Approaching  from  the  Left  and  Right  are  bad  schemes; 

3.  The  stopping  criterion  (52)  usually  terminates  computations  1-2  iter¬ 
ations  earlier  than  (49); 

4.  Comparisons  among  the  Middle  Way,  the  Fixed  Weight  Method,  Gragg’s 
scheme  and  the  Hybrid  Scheme. 

Our  hrst  example  is  taken  from  [11]:  ra  =  4,  =  1,  ^2  =  2  — /3,  ^3  =  2-|-/3 

and  ^4  =  3|,  where  /3  =  10“^  is  a  parameter  chosen  on  purpose  to  make 
Ai  close  to  62  and  A3  close  to  62.  We  make  A2  =  2  in  exact  arithmetic, 
half  way  between  62  and  ^3,  by  letting  w'^  =  (2,/3,/3,2),  p  =  11^11”^  and 
=  (Cl,C2,C3,C4)^  =  w/||w||. 

In  the  following  table,  we  compare  the  numbers  of  iterations  taken  by 
using  Approaching  from  the  Left  with  our  initial  guesses  (the  third  row)  and 
with  those  taken  by  using  Sorensen  and  Tang’s  secular  equation  solver  code. 
Iteration  stops  whenever  (49)  is  satished,  and  extra  precise  evaluations  [11] 
of  secular  functions  are  invoked. 


Table  1:  Initial  Guess  Issue^ 


13  =  lO--" 

13  =  10-’’ 

13  =  10-"'' 

Eigenvalues 

1 

2 

3 

4 

1 

2 

3 

4 

1 

2 

3 

4 

Left 

5 

2 

13 

4 

5 

1 

24 

4 

5 

2 

37 

4 

Sorensen  &  Tang’s 

12 

15 

16 

4 

18 

26 

25 

5 

28 

37 

38 

4 

^The  initial  guess  is  not  counted.  This  is  also  the  case  for  all  following  tables. 
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This  example  shows  our  initial  guesses  provide  better  starting  points. 
But  still  the  third  eigenvalue  A3  takes  lots  of  iterations  before  required  ac¬ 
curacy  is  reached.  The  reason  is  that  A3  is  extremely  close  to  S3  and  Ap¬ 
proaching  from  the  Left  behaves  badly  in  this  situation.  In  exact  arithmetic, 
A2  =  2  lies  exactly  in  the  half  way  between  S2  and  S3.  However,  as  Table 
1  indicates,  a  few  iterations  are  still  required  even  with  our  way  of  initial 
guessing,  which  should  give  correct  result  in  this  situation  in  exact  arith¬ 
metic.  As  a  matter  of  fact,  the  corresponding  problem  that  computer  had 
seen  is  the  one  after  a  few  roundoffs.  The  second  eigenvalue  to  that  problem 
is  2-\-0{e,fn),  which,  in  general,  does  not  lie  in  the  half  way  between  rounded 
S2  and  rounded  S3.  At  the  end  of  computation,  extra  precise  evaluations 
of  the  secular  function  basically  yields  A2  as  2  -|-  77  with  g  computed  to  cer¬ 
tain  relative  accuracy  which  is  unnecessarily  lot  more  for  just  computing  the 
eigenvalue  but  is  necessary  for  computing  the  full  orthogonal  eigenvectors 
later  on  [11]. 

From  now  on,  our  way  of  initial  guessing  is  always  used  except  in  Sorensen 
and  Tang’s  code.  The  following  table  exhibits  how  many  iterations  are  re¬ 
quired  respectively  by  the  three  different  zero  hnders  we  discussed  in  Sec¬ 
tion  2.  Invoke  extra  precise  evaluation  [11]  whenever  necessary. 


Table  2:  Schemes  in  Subsection  2.2 


(]  =  lo-"" 

(]  =  10-“ 

(]  =  10-^“ 

Eigenvalues 

I2 

23 

33 

44 

I2 

23 

33 

44 

I2 

23 

33 

44 

Left 

5 

2 

13 

4 

5 

1 

24 

4 

5 

2 

37 

4 

Right 

13 

2 

5 

4 

23 

1 

6 

4 

36 

2 

5 

4 

Middle 

5 

2 

5 

4 

5 

1 

6 

4 

5 

2 

5 

4 

In  this  table  each  item  ij  in  the  second  row  says  that  the  corresponding 
column  is  for  hnding  the  ith  eigenvalue  with  origin  translated  to  Sj.  The 
hrst  column  lists  which  iteration  schemes  in  Subsection  2.2  are  being  used. 
Others  are  numbers  of  iterations  required. 

Table  2  conhrms  our  previous  speculations.  As  A2  sits  “right”  on  the 
half  way  between  S2  and  S3,  the  three  different  schemes  required  almost  the 
same  number  of  iterations;  Approaching  from  the  Left  is  suitable  for  hnding 
Ai  while  Approaching  from  the  Right  is  obviously  not  because  Ai  comes  very 
close  to  S2',  the  contrary  conclusion  holds  for  hnding  A3  because  A3  comes 
very  close  to  S3.  In  any  event,  the  Middle  Way  is  the  best  among  the  three. 

The  stopping  criterion  (52)  usually  terminates  an  eigenvalue  computing 
1-2  iterations  earlier  than  the  stopping  criterion  (49).  The  following  table 
illustrates  the  point. 
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Table  3:  (49)  vs.  (52) 


(3  =  10-'' 

(3  =  10-“ 

(3  =  10-'“ 

Eigenvalues 

I2 

23 

33 

44 

I2 

23 

33 

44 

I2 

23 

33 

44 

Middle  with  (49) 

5 

2 

5 

4 

5 

1 

6 

4 

5 

2 

5 

4 

Middle  with  (52) 

4 

0 

5 

3 

4 

0 

5 

3 

3 

0 

3 

3 

Table  4  lists  the  numbers  of  iterations  for  an  example  of  a  rank-1  per¬ 
turbed  diagonal  eigensystem  arising  in  applying  the  divide- and- conquer  al¬ 
gorithm  to  the  glued  Wilkinson  matrix,  i.e.,  the  matrices  in  Test  1  of  [11]. 
Here  ra  =  30.  The  hrst  column  refers  to  which  eigenvalue  and  the  corre¬ 
sponding  temporary  origin  translated  to.  The  2nd  to  4th  columns  refer  to 
the  three  iteration  schemes  in  Subsection  2.2  with  stopping  criterion  (49). 
The  5th  column  refers  to  the  divide- and- conquer  code  from  Sorensen  and 
Tang.  The  last  column  refers  to  the  Middle  Way  with  stopping  criterion 
(52). 
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Table  4 


Left 

Right 

Middle 

Sorensen  &  Tang 

Middle  &  (52) 

ll 

2 

2 

2 

2 

1 

22 

2 

2 

2 

3 

1 

33 

2 

2 

2 

4 

1 

44 

2 

2 

2 

4 

1 

55 

2 

2 

2 

4 

1 

2 

2 

2 

4 

1 

77 

2 

2 

2 

4 

1 

§8 

2 

2 

2 

4 

1 

99 

3 

2 

2 

5 

2 

lOlo 

2 

2 

2 

3 

1 

lll2 

4 

19 

4 

8 

2 

12i2 

28 

11 

11 

28 

9 

13i3 

3 

2 

2 

5 

1 

14i5 

4 

18 

3 

6 

2 

15i5 

27 

16 

16 

27 

15 

16i7 

2 

6 

2 

7 

1 

17i8 

3 

24 

3 

23 

2 

18i8 

34 

13 

12 

37 

10 

19i9 

7 

2 

2 

7 

1 

to 

o 

to 

2 

3 

2 

6 

1 

1  2121 

16 

17 

17 

19 

16 

to 

to 

to 

to 

7 

2 

2 

7 

1 

1  2323 

3 

2 

3 

6 

2 

to 

to 

5 

2 

2 

7 

2 

2525 

3 

3 

3 

7 

3 

2626 

5 

2 

2 

5 

1 

2727 

3 

3 

3 

3 

2 

2829 

2 

6 

2 

7 

2 

2929 

8 

7 

7 

10 

6 

3030 

1 

1 

1 

6 

0 

Table  4  has  demonstrated  all  the  points  we  have  made  about  the  schemes 
in  Section  2  and  about  our  initial  guesses  over  Sorensen  and  Tang’s.  On  the 
other  hand,  it  also  explodes  the  inability  of  the  Middle  Way  to  compute 
the  12th,  15th,  18th  and  21st  eigenvalues  efficiently.  The  reason  is  exactly 
what  we  have  made  at  the  beginning  of  Section  3  and  which  motivates  us 
to  create  the  Fixed  Weight  Method  and  the  Hybrid  Scheme.  The  following 
table  presents  numbers  of  iterations  required  by  the  Middle  Way.,  the  Fixed 
Weight  Method,  Gragg’s  scheme  and  the  Hybrid  scheme  on  hnding  these  four 
“difficult”  eigenvalues.  From  now  on,  all  tests  are  done  under  the  stopping 
criterion  (52)  (except  for  Gragg’s  scheme)  as  we  have  observed  neither  stop¬ 
ping  criterion  affects  demonstrations  on  the  efficiency  of  a  particular  scheme. 
For  Gragg’s  scheme,  we  stop  whenever  we  detect  loss  of  monotonicity,  and 
thus  last  iterations  might  be  wastes. 
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Table  5 


Eigenvalues 

12 

15 

18 

21 

Total 

Per  Eig 

Peak 

Middle 

9 

15 

10 

16 

91 

3.03 

16 

Fixed 

2 

2 

2 

2 

42 

1.40 

5 

Gragg’s 

2 

2 

3 

2 

56 

1.87 

4 

Ftybrid 

1 

1 

1 

2 

38 

1.27 

4 

Here,  “Total”  refers  to  the  total  number  of  iterations  required  by  the 
corresponding  scheme  for  hnding  all  30  eigenvalues;  “Per  Eig”  the  average 
number  of  iterations  required  per  eigenvalue;  and  “Peak”  the  largest  one 
among  numbers  of  iterations  for  hnding  each  of  the  eigenvalues.  Three 
poles  were  used  by  the  Hybrid  Scheme  for  the  12th,  15th  and  18th  eigen¬ 
values.  One  might  interpret  Table  5  in  a  wrong  way  that  the  Fixed  Weight 
Method  is  good  enough  and  there  is  no  necessity  for  developing  the  Hybrid 
Scheme.  To  discover  the  drawback  of  the  Fixed  Weight  Method,  we  invented 
another  artihcial  problem  in  which  ra  =  50  and  in  which  the  31st  eigenvalue 
is  so  special  that  the  Fixed  Weight  Method  takes  as  many  as  31  iterations 
to  compute  it.  In  that  problem,  ^32  and  ^33  agree  as  many  as  9  decimal 
digits,  while  (^‘^2  much  smaller  than  (^13.  The  A31,  however,  is  fairly  away 
from  both  of  its  nearby  poles  though  it  comes  closer  to  ^32.  In  this  case, 
setting  S  =  CI2  underestimate  the  role  played  by  ^33  and  the  rest  other  than 
^32  so  much  that  the  iterations  generated  by  the  Fixed  Weight  Method  go 
unbearably  slow.  Numerical  data  are  displayed  in  Table  6.  One  thing  we 
have  to  say  is  that  for  the  Middle  Way  the  hrst  iteration  overshoots  because 
of  roundoff  errors  in  computing  the  correction.  We  handle  this  by  restarting 
the  iteration  at  the  middle  point  between  ^31  and  ^32. 


Table  6 


Middle 

Fixed 

Gragg’s 

Ftybrid 

Eigenvalue  31 

4 

31 

10 

3 

Total 

159 

181 

150 

145 

Per  Eig 

3.18 

3.62 

3.00 

2.90 

Peak 

6 

31 

10 

5 

To  see  how  iterations  go,  we  list  values  of  the  secular  function  after  each 
iteration: 


Middle: 

Fixed: 


Gragg’s: 


Ftybrid: 


1.4D+08 

1.4D+08 


1.4D+08 


1.4D+08 


-2  .  OD+00^1 . 5D-02^3 . 6D-07 
6 . 9D+07^3 . 5D+07^1 . 7D+07 
4 . 3D+06^2 . 2D+06^1 .  lD+06 
2.7D+05^1.4D+05^  •  •  • 

6 . 9D+07^2 . 6D+07^7 . 2D+06 
1 .  OD+05^2 . 6D+03^1 .  lD+01 
9.4D-10^0.0D+00 
1 . 4D-02^4 . 2D-07^9  .  OD-16 


O.OD+00 

8.7D+06 

5.4D+05 

1.2D+06 

l.lD-02 
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The  following  is  another  table  justifying  that  including  three  most  rele¬ 
vant  poles  helps.  It  is  about  a  problem  of  ra  =  30. 


Table  7 


Eigenvalues 

12 

15 

19 

22 

Total 

Per  Eig 

Peak 

Middle 

6 

7 

6 

10 

65 

2.17 

10 

Fixed 

7 

7 

8 

7 

64 

2.13 

8 

Gragg’s 

5 

6 

6 

5 

65 

2.17 

6 

Hybrid 

3 

3 

6 

6 

53 

1.77 

6 

In  theory  all  the  schemes  we  studied  share  a  fundamental  property  that 
every  correction  is  in  the  right  direction.  To  be  more  specihc,  whenever  the 
value  of  the  secular  function  is  negative  the  next  correct  must  be  positive  and 
vice  versa.  However,  it  does  not  hold  in  the  face  of  roundoff.  To  handle  this, 
we  simply  do  one  Newton  step  instead  whenever  fatal  roundoff  in  computing 
corrections  is  detected.  In  Table  7,  for  the  Fixed  Weight  Method  the  hrst 
iterations  for  the  19th  and  22nd  eigenvalues  are  Newton  steps. 

Numerous  other  examples  generated  either  randomly  or  artihcially  have 
been  run.  The  results  turn  out  to  be  very  satisfactory.  And  randomly  gen¬ 
erated  problems  do  not  give  difficulties.  Table  8  below  lists  a  few  more  data 
on  real  rank-1  perturbed  problems  extracted  from  applying  the  divide- and- 
conquer  algorithm  to  either  randomly  generated  symmetric  tridiagonal  ma¬ 
trices  or  to  symmetric  tridiagonal  matrices  obtained  by  reducing  randomly 
generated  symmetric  dense  matrices  to  tri diagonal  forms. 


Table  8 


n  =  100 

n  =  364 

n  =  700 

Total 

Per  Eig 

Peak 

Total 

Per  Eig 

Peak 

Total 

Per  Eig 

Peak 

Middle 

175 

1.75 

10 

1084 

2.98 

6 

2125 

3.04 

7 

Fixed 

146 

1.46 

6 

1106 

3.04 

6 

2157 

7 

Gragg’s 

185 

1.85 

6 

1031 

2.83 

5 

2054 

2.93 

5 

Hybrid 

146 

1.46 

5 

1074 

2.95 

5 

2093 

2.99 

5 

It  is  worth  mentioning  an  problem  suggested  to  the  author  by  W.  B. 
Gragg:  D  =  diag  (1,  2,  •  •  • ,  ra),  p  =  1  and  2;  =  (1, 10“^,  •  •  • ,  In 

practice,  there  will  be  lots  of  deflation  for  large  n.  But  for  the  purpose  of 
testing  the  robustness  of  our  code,  we  run  it  on  this  problem  for  n  as  large 
as  100  without  deflating.  The  numerical  results  indicate  our  code  has  no 
difficulties  in  solving  the  problems. 
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8  Conclusions 


We  have  studied  different  rational  interpolations  for  the  secular  function, 
each  of  which  has  different  strong  points  and  based  upon  which  many  schemes 
have  been  proposed  and  studied.  A  proper  combination  leads  to  the  Hybrid 
Scheme  which  competes  with  Gragg’s  cubic  convergent  scheme  on  random 
problems,  however,  as  our  numerical  results  indicated,  there  are  artihcial 
problems  in  which  the  regions  of  at  least  quadratic  convergence  for  the  Hy¬ 
brid  Scheme  are  larger  than  those  of  cubic  convergence  for  Gragg’s  scheme. 
Also  the  Hybrid  Scheme  keeps  peak  number  of  iterations  relatively  small, 
which  is  extremely  helpful  for  solving  the  secular  equation  in  parallel  be¬ 
cause  the  total  time  is  determined  roughly  by  whichever  eigenvalue  takes 
the  largest  number  of  iterations. 

We  also  discussed  a  few  implementation  details  for  making  a  robust  code 
and  for  achieving  the  desired  solution  as  accurately  as  possible. 
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